A SHARP ANALYSIS OF THE MIXING TIME FOR 
RANDOM WALK ON ROOTED TREES 



JASON FULMAN 

Abstract. We define an analog of Plancherel measure for the set of 
rooted unlabeled trees on n vertices, and a Markov chain which has 
this measure as its stationary distribution. Using the combinatorics of 
commutation relations, we show that order $n^2$ steps are necessary and
suffice for convergence to the stationary distribution. 



1. Introduction 

The Plancherel measure of the symmetric group is a probability measure
on the irreducible representations of the symmetric group which chooses
a representation with probability proportional to the square of its
dimension. Equivalently, the irreducible representations of the symmetric
group are parameterized by partitions $\lambda$ of $n$, and the Plancherel
measure chooses a partition $\lambda$ with probability

$$\frac{n!}{\prod_{x \in \lambda} h(x)^2} \tag{1}$$

where the product is over boxes in the partition and $h(x)$ is the hooklength
of a box. The hooklength of a box $x$ is defined as 1 + number of boxes in
same row as $x$ and to right of $x$ + number of boxes in same column of $x$
and below $x$. For example we have filled in each box in the partition of 7
below with its hooklength

[Figure: a partition of 7 with each box filled in with its hooklength]

and the Plancherel measure would choose this partition with probability
$7!/\prod_x h(x)^2$. There has been significant interest in the statistical
properties of partitions chosen from Plancherel measure of the symmetric
group; for this the reader can consult [?], [?], [?] and the many references
therein.
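The hooklength rule and formula (1) can be checked on small cases. The following is a minimal Python sketch; the function names and the list encoding of partitions are choices made for this illustration, and the partition (4, 2, 1) used in the usage note is an illustrative choice that need not be the one pictured above.

```python
from fractions import Fraction
from math import factorial

def hook_lengths(partition):
    """Hooklengths of a partition, row by row.

    `partition` is a weakly decreasing list of positive row lengths.
    """
    # conjugate[j] = number of rows that have a box in column j
    conjugate = [sum(1 for row in partition if row > j)
                 for j in range(partition[0])]
    # hook = boxes to the right + boxes below + the box itself
    return [[(row - j - 1) + (conjugate[j] - i - 1) + 1 for j in range(row)]
            for i, row in enumerate(partition)]

def plancherel(partition):
    """Probability n! / prod_x h(x)^2 assigned to a partition of n."""
    n = sum(partition)
    prod = 1
    for row in hook_lengths(partition):
        for h in row:
            prod *= h
    return Fraction(factorial(n), prod * prod)

def partitions(n, largest=None):
    """Generate all partitions of n >= 1 as weakly decreasing lists."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest
```

For instance, `hook_lengths([4, 2, 1])` returns `[[6, 4, 2, 1], [3, 1], [1]]`, and summing `plancherel(p)` over `partitions(7)` gives exactly 1, as a probability measure should.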



In this paper we define a similar measure on the set of rooted, unlabeled
trees on $n$ vertices. We place the root vertex on top, and the four rooted
trees on 4 vertices are depicted below:

[Figure: the four rooted trees on 4 vertices]

This measure chooses a rooted tree with probability

$$\pi(t) = \frac{n\,2^{n-1}}{|SG(t)| \prod_{v \in t} h(v)^2} \tag{2}$$

where $h(v)$ is the size of the subtree with root $v$, and $|SG(t)|$ is a
certain symmetry factor associated to the tree $t$ (precise definitions are
given in Section 3). We do not know that this measure has applications similar
to the Plancherel measure of the symmetric group, but the resemblance is
striking. Moreover, there are Hopf algebras in the physics literature whose
generators are rooted trees (Kreimer's Hopf algebra [?], [21], a Hopf algebra
of Connes and Moscovici [10], and a Hopf algebra of Grossman and Larson [10]),
and as a paper of Hoffman [19] makes clear, the combinatorics of these Hopf
algebras is very close to the combinatorics we use in this paper.

In fact the main object we study is a Markov chain $K$ which has it as
its stationary distribution; this Markov chain is defined in Section 3 and
involves removing a single terminal vertex and reattaching it. There are
several ways of quantifying the convergence rate of a Markov chain on a
state space $X$ to its stationary distribution; we use the maximal separation
distance after $r$ steps, defined as

$$s^*(r) := \max_{x,y \in X} \left[ 1 - \frac{K^r(x,y)}{\pi(y)} \right],$$

where $K^r(x,y)$ is the chance of transitioning from $x$ to $y$ after $r$
steps. In general it can be quite tricky even to determine which $x,y$ attain
the maximum in the definition of $s^*(r)$. We do this, and prove that for
$c > 0$ fixed,

$$\lim_{n \to \infty} s^*(cn^2) = \sum_{i=3}^{\infty} \frac{(-1)^{i-1}}{2} (2i-1)(i+1)(i-2)\, e^{-ci(i-1)}.$$

There are very few Markov chains for which such precise asymptotics are
known. Our proof method uses a commutation relation of a growth and
pruning operator on rooted trees (due to Hoffman [19]), a formula for the
eigenvalues of $K$, and ideas from [15]. Details appear in Section 4.

We mention that the Markov chain $K$ is very much in the spirit of the
down-up chains (on the state space of partitions) studied in [?], [7], [15],
[17], [23]. There are also similarities to certain random walks on
phylogenetic trees (cladograms) studied in [?], [13], [23]. Our methods only
partly apply to these walks (the geometry of the two spaces of trees is
different), so this will be studied in another work.

To close the introduction, we mention two reasons why it can be useful to
understand a Markov chain $K$ whose stationary distribution $\pi$ is of
interest. First, in analogy with Plancherel measure of the symmetric group,
one can hope to use Stein's method ([13]) or other techniques ([?]) to study
statistical properties of $\pi$. Second, convergence rates of $K$ can lead to
concentration inequalities for statistics of $\pi$ [?].

2. Background on Markov chains 

We will be concerned with the theory of finite Markov chains. Thus $X$ will
be a finite set (in our case the set of rooted unlabeled trees on $n$
vertices) and $K$ a matrix indexed by $X \times X$ whose rows sum to 1. Let
$\pi$ be a probability distribution on $X$ such that $K$ is reversible with
respect to $\pi$; this means that $\pi(x)K(x,y) = \pi(y)K(y,x)$ for all $x,y$
and implies that $\pi$ is a stationary distribution for the Markov chain
corresponding to $K$ (i.e. that $\pi(x) = \sum_y \pi(y)K(y,x)$ for all $x$).

A common way to quantify convergence rates of Markov chains is to use
separation distance, introduced by Aldous and Diaconis [2], [?]. They define
the separation distance of a Markov chain $K$ started at $x$ as

$$s(r) = \max_{y} \left[ 1 - \frac{K^r(x,y)}{\pi(y)} \right]$$

and the maximal separation distance of the Markov chain $K$ as

$$s^*(r) = \max_{x,y} \left[ 1 - \frac{K^r(x,y)}{\pi(y)} \right].$$

They show that the maximal separation distance has the nice properties:

• $\frac{1}{2} \max_x \sum_y |K^r(x,y) - \pi(y)| \le s^*(r)$

• (monotonicity) $s^*(r_1) \le s^*(r_2)$, $r_1 \ge r_2$

• (submultiplicativity) $s^*(r_1 + r_2) \le s^*(r_1) s^*(r_2)$
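The definition of $s^*(r)$ and the first property can be illustrated on a toy chain. The sketch below is plain Python with exact rational arithmetic; the two-state matrix in the usage note is a hypothetical example chosen for this illustration and does not come from the paper.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_pow(K, r):
    """K^r for an integer r >= 0 (K^0 is the identity)."""
    size = len(K)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(r):
        result = mat_mul(result, K)
    return result

def max_separation(K, pi, r):
    """s*(r) = max over x, y of 1 - K^r(x, y) / pi(y)."""
    Kr = mat_pow(K, r)
    size = len(K)
    return max(1 - Kr[x][y] / pi[y] for x in range(size) for y in range(size))

def total_variation(K, pi, r):
    """(1/2) max_x sum_y |K^r(x, y) - pi(y)|, the left side of the first property."""
    Kr = mat_pow(K, r)
    size = len(K)
    return max(sum(abs(Kr[x][y] - pi[y]) for y in range(size))
               for x in range(size)) / 2

# Hypothetical reversible two-state chain with stationary law (1/3, 2/3).
TOY_K = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 4), Fraction(3, 4)]]
TOY_PI = [Fraction(1, 3), Fraction(2, 3)]
```

With these toy inputs, `max_separation(TOY_K, TOY_PI, 1)` is 1/4 and `total_variation(TOY_K, TOY_PI, 1)` is 1/6, consistent with the first bullet.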

3. Combinatorics of rooted trees 

For a finite rooted tree $t$, we let $|t|$ denote the number of vertices of
$t$; $\mathcal{T}_n$ will be the set of rooted unlabeled trees on $n$
vertices. For example $\mathcal{T}_1 = \{\bullet\}$ consists of only the root
vertex, and the four elements of $\mathcal{T}_4$ were depicted in the
introduction. Letting $T_n = |\mathcal{T}_n|$ and $T_0 = 0$, there is a
recursion

$$\sum_{n \ge 1} T_n x^n = x \prod_{n \ge 1} (1 - x^n)^{-T_n}$$

from which one obtains $T_1 = 1$, $T_2 = 1$, $T_3 = 2$, $T_4 = 4$, $T_5 = 9$,
$T_6 = 20$, etc. (see [24] for more information on this sequence).
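The recursion determines the $T_n$ one at a time, because the coefficient of $x^{n-1}$ in the product only involves the factors with index below $n$. A minimal Python sketch of that coefficient extraction follows; the function name is a choice made for this illustration.

```python
def rooted_tree_counts(limit):
    """Return [T_1, ..., T_limit] from sum T_n x^n = x * prod (1 - x^n)^(-T_n)."""
    # series[j] is the coefficient of x^j in the partial product over the
    # factors handled so far; only degrees below `limit` are ever needed.
    series = [1] + [0] * (limit - 1)
    counts = []
    for n in range(1, limit + 1):
        t_n = series[n - 1]        # coefficient of x^n in x * (product)
        counts.append(t_n)
        # Multiply by (1 - x^n)^(-T_n): T_n successive multiplications by
        # 1 / (1 - x^n), each of which is a running sum with stride n.
        for _ in range(t_n):
            for j in range(n, limit):
                series[j] += series[j - n]
    return counts
```

Calling `rooted_tree_counts(6)` reproduces the values 1, 1, 2, 4, 9, 20 listed above.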

A rooted tree can be viewed as a directed graph by directing all edges
away from the root, and a vertex is called terminal if it has no outgoing
edge. There is a partial order $\preceq$ on the set $\mathcal{T}$ of all
finite rooted trees defined by letting $t$ be covered by $t'$ exactly when $t$
can be obtained from $t'$ by removing a single terminal vertex and the edge
into it; we denote this by $t \nearrow t'$ or $t' \searrow t$.

When $t \nearrow t'$, one can define two quantities

$$n(t,t') = |\text{vertices of } t \text{ to which a new edge can be added to get } t'|$$

and

$$m(t,t') = |\text{edges of } t' \text{ which when removed give } t|.$$

These need not be equal, as can be seen by taking $t, t'$ to be:

[Figure: the two rooted trees $t$ and $t'$]

Then $n(t,t') = 1$ and $m(t,t') = 2$.

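Let $\mathbb{C}\mathcal{T}_n$ denote the complex vector space with basis the
elements of $\mathcal{T}_n$. For $n \ge 1$, Hoffman [19] defines a growth
operator $G : \mathbb{C}\mathcal{T}_n \to \mathbb{C}\mathcal{T}_{n+1}$ by

$$G(t) = \sum_{t' \searrow t} n(t,t')\, t',$$

and for $n \ge 2$ a pruning operator
$P : \mathbb{C}\mathcal{T}_n \to \mathbb{C}\mathcal{T}_{n-1}$ by

$$P(t) = \sum_{t' \nearrow t} m(t',t)\, t'.$$

One sets $P(\bullet) = 0$.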

One can extend the definitions of $m(t,t')$ and $n(t,t')$ to any pair of
rooted trees $t,t'$ with $|t'| - |t| = k \ge 0$ by setting

$$G^k(t) = \sum_{|t'| = |t|+k} n(t,t')\, t'$$

and

$$P^k(t') = \sum_{|t| = |t'|-k} m(t,t')\, t.$$

Since $\bullet \preceq t$ for all $t$, one can think of $n(\bullet,t)$ as the
number of ways to build up $t$, and of $m(\bullet,t)$ as the number of ways to
take $t$ apart by sequentially removing terminal edges. To simplify notation,
we let $n(t) = n(\bullet,t)$ and $m(t) = m(\bullet,t)$. For example, the
reader can check that the four trees

[Figure: the four rooted trees $t_1, t_2, t_3, t_4$ on 4 vertices]

satisfy $n(t_1) = 1, m(t_1) = 1$; $n(t_2) = 1, m(t_2) = 2$;
$n(t_3) = 3, m(t_3) = 3$; $n(t_4) = 1, m(t_4) = 6$ respectively.

There is a "hook-length" type formula for $m(t)$ in the literature. Namely
if $t$ has $n$ vertices,

$$m(t) = \frac{n!}{\prod_{v \in t} h(v)} \tag{3}$$

where $h(v)$ is the number of vertices in the subtree with root $v$; see
Section 22 of [25] or Exercise 5.1.4-20 of [20] for a proof.

As for $n(t)$, it is also known as the "Connes-Moscovici weight" [21]. To
give a formula for it, we use the concept of the symmetry group $SG(t)$ of a
tree. For $v$ a vertex of $t$ with children $\{v_1, \dots, v_k\}$, $SG(t,v)$
is the group generated by the permutations that exchange the trees with roots
$v_i$ and $v_j$ when they are isomorphic rooted trees; then $SG(t)$ is defined
as the direct product

$$SG(t) = \prod_{v \in t} SG(t,v).$$

It is proved in [21] that

$$n(t) = \frac{m(t)}{|SG(t)|}. \tag{4}$$

More generally, Proposition 2.5 of [19] shows that

$$n(s,t)\,|SG(t)| = m(s,t)\,|SG(s)| \tag{5}$$

when $|s| \le |t|$.

Definition 1 We define a probability measure $\pi_n$ on the set of rooted
(unlabeled) trees of size $n$ by

$$\pi_n(t) = \frac{m(t)\,n(t)}{\prod_{i=2}^{n} \binom{i}{2}} = \frac{n\,2^{n-1}}{|SG(t)| \prod_{v \in t} h(v)^2}. \tag{6}$$

It follows from Proposition 2.8 of [19] that $\pi_n$ is in fact a probability
measure (i.e. that the probabilities sum to 1). The second equality in (6)
follows from equations (3) and (4). The reader can check that the four trees

[Figure: the four rooted trees on 4 vertices]

are assigned probabilities 1/18, 1/9, 1/2, 1/3 respectively.
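Formulas (3), (4) and (6) are easy to evaluate mechanically. The sketch below is a minimal Python illustration; the encoding of a rooted tree as the sorted tuple of its root's subtrees (with `()` for a single vertex), the function names, and the constants `PATH`, `T2`, `T3`, `STAR` are all assumptions of this sketch rather than notation from the paper.

```python
from collections import Counter
from fractions import Fraction
from math import comb, factorial

def size(t):
    """Number of vertices of t."""
    return 1 + sum(size(child) for child in t)

def hook_product(t):
    """Product over vertices v of h(v), the size of the subtree rooted at v."""
    result = size(t)
    for child in t:
        result *= hook_product(child)
    return result

def m(t):
    """Equation (3): m(t) = n! / prod_v h(v)."""
    return factorial(size(t)) // hook_product(t)

def symmetry_order(t):
    """|SG(t)|: at every vertex, isomorphic child subtrees may be permuted."""
    result = 1
    for child, multiplicity in Counter(t).items():
        result *= factorial(multiplicity) * symmetry_order(child) ** multiplicity
    return result

def n_weight(t):
    """Equation (4): n(t) = m(t) / |SG(t)|."""
    return m(t) // symmetry_order(t)

def pi(t):
    """First form of (6): m(t) n(t) / prod_{i=2}^n C(i, 2)."""
    denom = 1
    for i in range(2, size(t) + 1):
        denom *= comb(i, 2)
    return Fraction(m(t) * n_weight(t), denom)

def pi_hook_form(t):
    """Second form of (6): n 2^(n-1) / (|SG(t)| prod_v h(v)^2)."""
    n = size(t)
    return Fraction(n * 2 ** (n - 1), symmetry_order(t) * hook_product(t) ** 2)

# The four trees on 4 vertices in this encoding.
PATH = ((((),),),)      # a single chain
T2 = (((), ()),)        # root, one child, two grandchildren
T3 = ((), ((),))        # root with a leaf child and a chain of length two
STAR = ((), (), ())     # root with three children
```

Evaluating `pi` on `PATH`, `T2`, `T3`, `STAR` gives 1/18, 1/9, 1/2, 1/3, and `pi_hook_form` agrees on each tree.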

Definition 2 We define upward transition probabilities from
$t \in \mathcal{T}_{n-1}$ to $t' \in \mathcal{T}_n$ by

$$P_u(t,t') = \frac{m(t,t')\,n(t')}{\binom{n}{2}\, n(t)} = \frac{n(t,t')\,m(t')}{\binom{n}{2}\, m(t)}$$

and downward transition probabilities from $t \in \mathcal{T}_n$ to
$t' \in \mathcal{T}_{n-1}$ by

$$P_d(t,t') = \frac{m(t',t)\,m(t')}{m(t)}.$$

It is clear from the definitions that the downward transition probabilities
sum to 1. The second equality in the definition of $P_u$ is from (4) and (5),
and it follows from Proposition 2.8 of [19] that the upward transition
probabilities sum to 1. We define a "down-up" Markov chain with state space
$\mathcal{T}_n$ by composing the down chain with the up chain, i.e.

$$K(t,t') = \sum_{s \nearrow t,t'} P_d(t,s) P_u(s,t') = \sum_{s \nearrow t,t'} \frac{m(s,t)\,m(s)}{m(t)} \cdot \frac{n(s,t')\,m(t')}{\binom{n}{2}\, m(s)} = \frac{m(t')}{\binom{n}{2}\, m(t)} \sum_{s \nearrow t,t'} m(s,t)\, n(s,t').$$



Thus we deduce the crucial relation

$$K_n = \frac{1}{\binom{n}{2}}\, A\, GP_n\, A^{-1} \tag{7}$$

where $A$ is the diagonal matrix which multiplies a tree $t$ by $m(t)$, and
$P$, $G$ are the pruning and growth operators. The subscript $n$ indicates
that the chain is on trees of size $n$.

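For example, ordering the four elements of $\mathcal{T}_4$ as

[Figure: the four rooted trees on 4 vertices]

one calculates the transition matrix

[Matrix: transition matrix of the down-up chain on $\mathcal{T}_4$]

Proposition 3.1. (1) If $s$ is chosen from the measure $\pi_{n-1}$ and one
moves from $s$ to $t$ with probability $P_u(s,t)$, then $t$ is distributed
according to the measure $\pi_n$.

(2) If $t$ is chosen from the measure $\pi_{n+1}$ and one moves from $t$ to
$s$ with probability $P_d(t,s)$, then $s$ is distributed according to the
measure $\pi_n$.

(3) The "down-up" Markov chain $K_n$ on rooted trees of size $n$ is
reversible with respect to $\pi_n$.

Proof. For part 1, one calculates that

$$\sum_{s \nearrow t} \pi_{n-1}(s) P_u(s,t) = \sum_{s \nearrow t} \frac{m(s)\,n(s)}{\prod_{i=2}^{n-1} \binom{i}{2}} \cdot \frac{n(s,t)\,m(t)}{\binom{n}{2}\, m(s)} = \frac{m(t)}{\prod_{i=2}^{n} \binom{i}{2}} \sum_{s \nearrow t} n(s)\, n(s,t) = \frac{m(t)\,n(t)}{\prod_{i=2}^{n} \binom{i}{2}} = \pi_n(t).$$

For part 2, one computes that

$$\sum_{t \searrow s} \pi_{n+1}(t) P_d(t,s) = \sum_{t \searrow s} \frac{m(t)\,n(t)}{\prod_{i=2}^{n+1} \binom{i}{2}} \cdot \frac{m(s,t)\,m(s)}{m(t)} = \frac{m(s)}{\prod_{i=2}^{n+1} \binom{i}{2}} \sum_{t \searrow s} n(t)\, m(s,t) = \frac{m(s)\,n(s)}{\prod_{i=2}^{n} \binom{i}{2}} = \pi_n(s),$$

where the last line follows since the upward transition probabilities from $s$
sum to 1.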

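For part 3, one calculates that

$$\pi_n(t) K_n(t,t') = \frac{m(t)\,n(t)}{\prod_{i=2}^{n} \binom{i}{2}} \cdot \frac{m(t')}{\binom{n}{2}\, m(t)} \sum_{s \nearrow t,t'} m(s,t)\, n(s,t')$$

$$= \frac{n(t)\,m(t')}{\binom{n}{2} \prod_{i=2}^{n} \binom{i}{2}}\, |SG(t)| \sum_{s \nearrow t,t'} \frac{n(s,t)\, n(s,t')}{|SG(s)|}$$

$$= \frac{n(t)\,m(t')}{\binom{n}{2} \prod_{i=2}^{n} \binom{i}{2}} \cdot \frac{|SG(t)|}{|SG(t')|} \sum_{s \nearrow t,t'} n(s,t)\, m(s,t')$$

$$= \frac{m(t)\,n(t')}{\binom{n}{2} \prod_{i=2}^{n} \binom{i}{2}} \sum_{s \nearrow t,t'} n(s,t)\, m(s,t')$$

$$= \pi_n(t') K_n(t',t).$$

Note that equation (5) was used in equalities 2 and 3 and that equation (4)
was used in the fourth equality. □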
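The definitions of $P_d$, $P_u$ and $K$ can be checked by brute force for small $n$. The following is a minimal Python sketch; the tuple encoding of trees (sorted tuple of the root's subtrees, `()` for a single vertex) and all function names are assumptions of this illustration, and $m(s,t)$, $n(s,t)$ are obtained by direct enumeration rather than by any formula from the paper.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache
from math import comb

def grow(t):
    """Yield, for each vertex v of t, the tree with a new terminal vertex at v."""
    yield tuple(sorted(t + ((),)))
    for i, child in enumerate(t):
        for grown in grow(child):
            yield tuple(sorted(t[:i] + (grown,) + t[i + 1:]))

def prune(t):
    """Yield, for each terminal vertex of t, the tree with that vertex removed."""
    for i, child in enumerate(t):
        if child == ():
            yield t[:i] + t[i + 1:]
        else:
            for pruned in prune(child):
                yield tuple(sorted(t[:i] + (pruned,) + t[i + 1:]))

@lru_cache(maxsize=None)
def m(t):
    """m(t): number of ways to dismantle t one terminal vertex at a time."""
    return 1 if t == () else sum(m(s) for s in prune(t))

def trees(n):
    """All rooted unlabeled trees on n vertices, in a fixed (arbitrary) order."""
    level = {()}
    for _ in range(n - 1):
        level = {grown for s in level for grown in grow(s)}
    return sorted(level)

def down_up_matrix(n):
    """Return (states, K) with K[i][j] = K(states[i], states[j]), for n >= 2."""
    states = trees(n)
    index = {t: i for i, t in enumerate(states)}
    K = [[Fraction(0)] * len(states) for _ in states]
    for t in states:
        for s, m_st in Counter(prune(t)).items():        # m(s, t)
            for t2, n_st2 in Counter(grow(s)).items():   # n(s, t2)
                # P_d(t, s) * P_u(s, t2) = m(s,t) n(s,t2) m(t2) / (C(n,2) m(t))
                K[index[t]][index[t2]] += Fraction(m_st * n_st2 * m(t2),
                                                   comb(n, 2) * m(t))
    return states, K
```

For $n = 4$, with the trees ordered as $t_1, t_2, t_3, t_4$ from the example with $m = 1, 2, 3, 6$, this sketch produces the rows (1/6, 1/3, 1/2, 0), (1/6, 1/3, 1/2, 0), (1/18, 1/9, 1/2, 1/3), (0, 0, 1/2, 1/2). Each row sums to 1, and the matrix satisfies the reversibility identity of part 3 with respect to (1/18, 1/9, 1/2, 1/3).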

The final combinatorial fact we will need about rooted trees is the
following commutation relation between the growth and pruning operators
(Proposition 2.2 of [19]):

$$PG_n - GP_n = nI \tag{8}$$

for all $n \ge 1$. Here $I$ is the identity operator, so the right hand side
multiplies a tree by its size.
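Relation (8) can be verified exhaustively on small trees. The sketch below is a minimal Python check; the tuple encoding of trees and the helper names are assumptions of this illustration, with $G$ and $P$ realized by enumerating attachments and removals of terminal vertices.

```python
from collections import Counter

def grow(t):
    """Yield, for each vertex v of t, the tree with a new terminal vertex at v."""
    yield tuple(sorted(t + ((),)))
    for i, child in enumerate(t):
        for grown in grow(child):
            yield tuple(sorted(t[:i] + (grown,) + t[i + 1:]))

def prune(t):
    """Yield, for each terminal vertex of t, the tree with that vertex removed."""
    for i, child in enumerate(t):
        if child == ():
            yield t[:i] + t[i + 1:]
        else:
            for pruned in prune(child):
                yield tuple(sorted(t[:i] + (pruned,) + t[i + 1:]))

def apply(operator, vector):
    """Extend grow or prune linearly to integer combinations of trees."""
    result = Counter()
    for tree, coeff in vector.items():
        for image in operator(tree):
            result[image] += coeff
    return result

def commutator(t):
    """(PG - GP)(t) as a dict tree -> coefficient, with zero entries dropped."""
    pg = apply(prune, apply(grow, Counter({t: 1})))
    gp = apply(grow, apply(prune, Counter({t: 1})))
    diff = {tree: pg[tree] - gp[tree] for tree in set(pg) | set(gp)}
    return {tree: c for tree, c in diff.items() if c != 0}

def trees(n):
    """All rooted unlabeled trees on n vertices."""
    level = {()}
    for _ in range(n - 1):
        level = {grown for s in level for grown in grow(s)}
    return sorted(level)
```

For every tree `t` returned by `trees(n)`, `commutator(t)` should equal `{t: n}`, which is the statement that $PG - GP$ multiplies a tree of size $n$ by $n$.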

4. Proof of main results 

The purpose of this section is to obtain precise asymptotics for the
maximal separation distance $s^*(r)$ of the Markov chain $K$ after $r$
iterations. To do this we use equation (7), the commutation relation (8), and
the methodology of [15]. To begin we determine the eigenvalues of the Markov
chain $K$. The multiplicities involve the numbers of rooted unlabeled trees of
size $i$, discussed in Section 3.

Proposition 4.1. The eigenvalues of the Markov chain $K$ are:

    $1$                                      multiplicity $1$
    $1 - \binom{i}{2} / \binom{n}{2}$        multiplicity $T_i - T_{i-1}$   ($3 \le i \le n$)

Proof. Since $K_n = \frac{1}{\binom{n}{2}}\, A\, GP_n\, A^{-1}$, it suffices
to determine the eigenvalues of $GP_n$; these follow from the commutation
relation (8) and Theorem 2.6 of [?]. □
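As a quick consistency check on Proposition 4.1, the multiplicities should add up to the number of states $T_n$. The sketch below is a minimal Python illustration; the list `T` simply hardcodes the values $T_1, \dots, T_6$ quoted in Section 3, and the function name is a choice made here.

```python
from fractions import Fraction
from math import comb

# Tree counts T_1, ..., T_6 from Section 3; index 0 holds the convention T_0 = 0.
T = [0, 1, 1, 2, 4, 9, 20]

def spectrum(n):
    """Eigenvalues of K_n with multiplicities, as stated in Proposition 4.1."""
    eigenvalues = {Fraction(1): 1}
    for i in range(3, n + 1):
        eigenvalues[1 - Fraction(comb(i, 2), comb(n, 2))] = T[i] - T[i - 1]
    return eigenvalues

def trace(n):
    """Sum of eigenvalues counted with multiplicity."""
    return sum(value * mult for value, mult in spectrum(n).items())
```

For $n = 4$ this gives eigenvalues 1, 1/2, 0 with multiplicities 1, 1, 2, so the trace is 3/2; for $3 \le n \le 6$ there are $n - 1$ distinct eigenvalues.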



Recall that our interest is in studying the behavior of

$$s^*(r) = \max_{t,t'} \left[ 1 - \frac{K^r(t,t')}{\pi(t')} \right].$$

Proposition 4.2 determines the pairs $(t,t')$ where this maximum is obtained.

Proposition 4.2. For all values of $r$, the quantity
$1 - \frac{K^r(t,t')}{\pi(t')}$ is maximized by letting $t$ be the unique
rooted tree with one terminal vertex and $t'$ be the unique tree with $n-1$
terminal vertices, or by letting $t'$ be the unique rooted tree with one
terminal vertex and $t$ be the unique tree with $n-1$ terminal vertices.

For instance when $n = 5$ the two relevant trees are

[Figure: the rooted tree on 5 vertices with one terminal vertex and the rooted tree on 5 vertices with 4 terminal vertices]

Proof. By relation (7), we seek the $t,t'$ minimizing

$$\frac{K^r(t,t')}{\pi(t')} = \frac{m(t')\,(GP)^r_n[t,t']}{\binom{n}{2}^r\, m(t)\, \pi(t')}.$$

By the commutation relation (8) and Proposition 4.5 of [?],

$$(GP)^r_n = \sum_{k=0}^{n} A_n(r,k)\, G^k P^k_n$$

where the $A_n(r,k)$ solve the recurrence

$$A_n(r,k) = A_n(r-1,k-1) + A_n(r-1,k) \left[ \binom{n}{2} - \binom{n-k}{2} \right]$$

with initial conditions $A_n(0,0) = 1$, $A_n(0,m) = 0$ for $m \ne 0$. Thus

$$\frac{K^r(t,t')}{\pi(t')} = \frac{m(t') \sum_{k=0}^{n} A_n(r,k)\, G^k P^k_n[t,t']}{\binom{n}{2}^r\, m(t)\, \pi(t')}. \tag{9}$$

The proposition now follows from three observations:

• All terms in (9) are non-negative. Indeed, this is clear from the
recurrence for $A_n(r,k)$.

• If $t$ is the unique rooted tree with one terminal vertex and $t'$ is the
unique rooted tree with $n-1$ terminal vertices (or the same holds
with $t, t'$ swapped), then the summands in (9) for $0 \le k \le n-3$ all
vanish. Indeed, in order to move from $t$ to $t'$ by pruning $k$ vertices
and then reattaching them, one must prune at least $n-2$ vertices.

• The $k = n-2$ and $k = n-1$ summands in (9) are independent of
$t,t'$. Indeed, for the $k = n-1$ summand, one has that

$$\frac{m(t')\, A_n(r,n-1)\, G^{n-1} P^{n-1}_n[t,t']}{\binom{n}{2}^r\, m(t)\, \pi(t')} = \frac{m(t')\, A_n(r,n-1)\, G^{n-1}[\bullet,t']}{\binom{n}{2}^r\, \pi(t')} = \frac{m(t')\, A_n(r,n-1)\, n(t')}{\binom{n}{2}^r\, \pi(t')} = A_n(r,n-1)\, \frac{\prod_{i=2}^{n} \binom{i}{2}}{\binom{n}{2}^r}.$$

A similar argument shows that the $k = n-2$ summand is equal to
$A_n(r,n-2)\, \prod_{i=2}^{n} \binom{i}{2} / \binom{n}{2}^r$.

□

Remark: The proof of Proposition 4.2 shows that

$$s^*(r) = 1 - \frac{\prod_{i=2}^{n} \binom{i}{2}}{\binom{n}{2}^r} \left[ A_n(r,n-2) + A_n(r,n-1) \right],$$

where $A_n(r,k)$ is the solution to the recurrence in the proof of Proposition
4.2.
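The recurrence for $A_n(r,k)$ together with the Remark gives a direct way to compute $s^*(r)$ exactly. A minimal Python sketch follows; the function names are choices made for this illustration, and it assumes $n \ge 3$ so that the indices $n-2$ and $n-1$ are both positive.

```python
from fractions import Fraction
from math import comb

def A_table(n, r):
    """Return the list [A_n(r, 0), ..., A_n(r, n)] from the recurrence."""
    A = [1] + [0] * n                    # A_n(0, 0) = 1, A_n(0, m) = 0 otherwise
    for _ in range(r):
        # A_n(r,k) = A_n(r-1,k-1) + A_n(r-1,k) * [C(n,2) - C(n-k,2)]
        A = [(A[k - 1] if k > 0 else 0) + A[k] * (comb(n, 2) - comb(n - k, 2))
             for k in range(n + 1)]
    return A

def separation_from_recurrence(n, r):
    """s*(r) via the Remark: 1 - prod C(i,2) / C(n,2)^r * [A(n-2) + A(n-1)]."""
    A = A_table(n, r)
    prod = 1
    for i in range(2, n + 1):
        prod *= comb(i, 2)
    return 1 - Fraction(prod, comb(n, 2) ** r) * (A[n - 2] + A[n - 1])
```

For $n = 4$ this yields $s^*(1) = 1$, $s^*(2) = 1/2$ and $s^*(3) = 1/4$; the value 1 at $r = 1$ reflects that the two extremal trees cannot be joined in fewer than $n - 2$ steps.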



In Theorem 4.3 we give an explicit formula for $s^*(r)$ and determine its
asymptotic behavior.

Theorem 4.3. Let $s^*(r)$ be the maximal separation distance after $r$
iterations of the down-up Markov chain $K$ on the space of rooted trees on $n$
vertices.

(1) For $r \ge 1$, $s^*(r)$ is equal to

$$\sum_{i=3}^{n} (-1)^{i-1}\, \frac{(2i-1)(i+1)(i-2)(n!)^2}{2n\,(n-i)!\,(n+i-1)!} \left( 1 - \frac{\binom{i}{2}}{\binom{n}{2}} \right)^r.$$

(2) For $c > 0$ fixed,

$$\lim_{n \to \infty} s^*(cn^2) = \sum_{i=3}^{\infty} \frac{(-1)^{i-1}}{2} (2i-1)(i+1)(i-2)\, e^{-ci(i-1)}.$$
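Both parts of the theorem are straightforward to evaluate. The sketch below is a minimal Python illustration; the function names and the truncation parameter `terms` are assumptions of this sketch. Part 1 is computed exactly with rationals, and part 2 is approximated by a partial sum in floating point.

```python
from fractions import Fraction
from math import comb, exp, factorial

def separation_exact(n, r):
    """Part 1 of Theorem 4.3 (valid for r >= 1), evaluated exactly."""
    total = Fraction(0)
    for i in range(3, n + 1):
        coeff = Fraction((2 * i - 1) * (i + 1) * (i - 2) * factorial(n) ** 2,
                         2 * n * factorial(n - i) * factorial(n + i - 1))
        rate = 1 - Fraction(comb(i, 2), comb(n, 2))
        total += (-1) ** (i - 1) * coeff * rate ** r
    return total

def separation_limit(c, terms=60):
    """Partial sum (i = 3 .. terms + 2) of the series in part 2, for c > 0."""
    return sum((-1) ** (i - 1) * (2 * i - 1) * (i + 1) * (i - 2) / 2
               * exp(-c * i * (i - 1))
               for i in range(3, terms + 3))
```

For small cases the exact formula gives $s^*(2) = 1/2$ when $n = 4$ and $s^*(3) = 41/50$ when $n = 5$. Comparing `float(separation_exact(n, c * n * n))` with `separation_limit(c)` for growing `n` gives a rough picture of the convergence in part 2, although the exact rational arithmetic becomes slow for large `n`.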

Proof. By Proposition 4.2 the maximal separation distance is attained when
$t$ is the unique rooted tree with one terminal vertex and $t'$ is the unique
rooted tree with $n-1$ terminal vertices. Note that it takes $n-2$ iterations
of the Markov chain $K$ to move from $t$ to $t'$. By Proposition 4.1, $K$ has
$n-1$ distinct eigenvalues (one more than the Markov chain distance between
$t$ and $t'$), so it follows from Proposition 5.1 of [16] that

$$s^*(r) = \sum_{i=3}^{n} \lambda_i^r \prod_{j \ne i} \frac{1 - \lambda_j}{\lambda_i - \lambda_j} \tag{10}$$

where $1$, $\lambda_i = 1 - \binom{i}{2} / \binom{n}{2}$, $i = 3, \dots, n$
are the distinct eigenvalues of $K$, and the product is over $3 \le j \le n$
with $j \ne i$. For $r \ge 1$, this is equal to

$$\sum_{i=3}^{n-1} \left( 1 - \frac{\binom{i}{2}}{\binom{n}{2}} \right)^r \prod_{j \ne i} \frac{\binom{j}{2}}{\binom{j}{2} - \binom{i}{2}} \tag{11}$$

and the first assertion follows by elementary simplifications.

For part 2 of the theorem, it is enough to show that for $c > 0$ fixed,
there is a constant $i_c$ (depending on $c$ but not on $n$) such that for
$i \ge i_c$, the summands in part 1 of the theorem are decreasing in magnitude
(and alternating in sign). Part 2 follows from this claim, since then one can
take limits for each fixed $i$. For $i \ge 2\sqrt{n}$ one checks that

$$\frac{(2i-1)(i+1)(i-2)(n!)^2}{2n\,(n-i)!\,(n+i-1)!}$$

is a decreasing function of $i$. To handle the case of $i \le 2\sqrt{n}$, one
need only show that

$$\frac{(n-i)(2i+1)(i+2)(i-1)}{(n+i)(2i-1)(i+1)(i-2)} \cdot \frac{\exp\left( cn^2 \log\left( 1 - \binom{i+1}{2} / \binom{n}{2} \right) \right)}{\exp\left( cn^2 \log\left( 1 - \binom{i}{2} / \binom{n}{2} \right) \right)} < 1 \tag{12}$$

for $i \ge i_c$, a constant depending on $c$ but not on $n$. This is easily
established, since using the inequalities $\log(1-x) \le -x$ for $x > 0$ in
the numerator and $\log(1-x) \ge -x - x^2$ for $0 < x < \frac{1}{2}$ in the
denominator gives that

$$\frac{\exp\left( cn^2 \log\left( 1 - \binom{i+1}{2} / \binom{n}{2} \right) \right)}{\exp\left( cn^2 \log\left( 1 - \binom{i}{2} / \binom{n}{2} \right) \right)} \le \exp\left( -cn^2 \left[ \frac{i}{\binom{n}{2}} - \frac{\binom{i}{2}^2}{\binom{n}{2}^2} \right] \right)$$

and (12) follows as $i \le 2\sqrt{n}$. □

Some authors who work on Markov chains similar to that studied here but
on different state spaces (e.g. [3], [13]) prefer to work with up-down chains
instead of down-up chains. Proposition 4.4 shows the study of maximal
separation for these two chains to be equivalent.

Proposition 4.4. Let $s^*_{UD_n}(r)$ denote the maximal separation distance
after $r$ iterations of the down-up chain on $\mathcal{T}_n$, and let
$s^*_{DU_n}(r)$ be the corresponding quantity for the up-down chain. Then

$$s^*_{DU_n}(r) = s^*_{UD_{n+1}}(r+1)$$

for all $n, r \ge 1$.

Proof. An argument similar to that used to prove equation (7) gives that

$$DU_n = \frac{1}{\binom{n+1}{2}}\, A\, PG\, A^{-1} \tag{13}$$

where $A$ is the diagonal matrix which multiplies a tree $t$ by $m(t)$, and
$P$, $G$ are the pruning and growth operators. Combining this with the
commutation relation (8), it follows that

$$(DU)^r_n = \frac{1}{\binom{n+1}{2}^r} \left[ A (nI + GP) A^{-1} \right]^r = \frac{1}{\binom{n+1}{2}^r} \sum_{l=0}^{r} \binom{r}{l}\, n^{r-l}\, A\, (GP)^l_n\, A^{-1}.$$

Arguing as in Proposition 4.2 one concludes that the same $t, t'$ maximize
the separation distance. Moreover, one sees from (13), commutation relation
(8), and Proposition 4.1 that the distinct eigenvalues of the up-down chain
on trees of size $n$ are $1$ and
$\mu_i = 1 - \binom{i}{2} / \binom{n+1}{2}$, $i = 3, \dots, n$. Thus the
argument of Theorem 4.3 gives that

$$s^*_{DU_n}(r) = \sum_{i=3}^{n} \mu_i^r \prod_{j \ne i} \frac{1 - \mu_j}{\mu_i - \mu_j}.$$

The proposition now follows by making the replacements $r \to r+1$ and
$n \to n+1$ in the left hand side of equation (10). □

To close, we note the following probabilistic interpretation of $s^*(r)$. We
use the convention that a random variable $X$ is called geometric with
parameter (probability of success) $p$ if $P(X = n) = p(1-p)^{n-1}$ for all
$n \ge 1$.

Proposition 4.5. Letting $s^*(r)$ be as in Theorem 4.3, one has that
$s^*(r) = P(T > r)$, where $T = \sum_{i=3}^{n} X_i$, and the $X_i$'s are
independent geometrics with parameters $\binom{i}{2} / \binom{n}{2}$.

Proof. This is immediate from equation (10) and Proposition 2.4 of [16]. □
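The tail probability in Proposition 4.5 can be computed exactly by convolving the geometric distributions. The sketch below is a minimal Python illustration; the function name is a choice made here, and only the mass on values up to $r$ is tracked, which suffices because each $X_i$ is at least 1 and so partial sums never decrease.

```python
from fractions import Fraction
from math import comb

def tail_of_geometric_sum(n, r):
    """P(T > r) for T = X_3 + ... + X_n with X_i geometric(C(i,2)/C(n,2))."""
    # dist[k] = P(partial sum = k), tracked exactly for k = 0 .. r only.
    dist = [Fraction(1)] + [Fraction(0)] * r
    for i in range(3, n + 1):
        p = Fraction(comb(i, 2), comb(n, 2))
        new = [Fraction(0)] * (r + 1)
        for k, mass in enumerate(dist):
            if mass == 0:
                continue
            for x in range(1, r - k + 1):      # P(X_i = x) = p (1 - p)^(x - 1)
                new[k + x] += mass * p * (1 - p) ** (x - 1)
        dist = new
    return 1 - sum(dist)                       # 1 - P(T <= r)
```

For $n = 4$ one has $T = X_3 + 1$ with $X_3$ geometric of parameter 1/2, so `tail_of_geometric_sum(4, 2)` is 1/2, in agreement with the value of $s^*(2)$ given by part 1 of Theorem 4.3.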



We remark that representations of separation distance similar to that in
Proposition 4.5 are in the literature for stochastically monotone birth-death
chains with non-negative eigenvalues ([12], [13]) and for some random walks
on partitions [15]. Of course the Markov chain $K$ studied in this paper is
not one-dimensional.